AITopics | rating score

Online ratings influence customer decision-making, yet standard aggregation methods, such as the sample mean, fail to adapt to quality changes over time and ignore review heterogeneity (e.g., a review's sentiment or helpfulness). To address these challenges, we demonstrate the value of using the Gaussian process (GP) framework for rating aggregation. Specifically, we present a tailored GP model that captures the dynamics of ratings over time while additionally accounting for review heterogeneity. Based on 121,123 ratings from Yelp, we compare the predictive power of different rating aggregation methods in predicting future ratings and find that the GP model is considerably more accurate, reducing the mean absolute error by 10.2% compared to the sample mean. Our findings have important implications for marketing practitioners and customers. By moving beyond means, designers of online reputation systems can display more informative and adaptive aggregated rating scores that are accurate signals of expected customer satisfaction.
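To make the contrast between a static sample mean and a time-aware GP aggregate concrete, here is a minimal sketch. It is not the tailored model from the paper: it assumes a plain GP over time with a squared-exponential kernel, a constant prior mean equal to the sample mean, and hand-picked hyperparameters (`length_scale`, `variance`, `noise`), and it ignores review heterogeneity. The ratings are made up for illustration.

```python
import math

def rbf_kernel(t1, t2, length_scale, variance):
    """Squared-exponential covariance between two time points."""
    return variance * math.exp(-0.5 * ((t1 - t2) / length_scale) ** 2)

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def gp_predict(times, ratings, t_new, length_scale=3.0, variance=1.0, noise=0.5):
    """Posterior mean at t_new of a GP whose prior mean is the sample mean.

    The hyperparameter defaults are illustrative assumptions, not fitted values.
    """
    mean = sum(ratings) / len(ratings)
    n = len(times)
    gram = [[rbf_kernel(times[i], times[j], length_scale, variance)
             + (noise if i == j else 0.0) for j in range(n)] for i in range(n)]
    alpha = solve(gram, [r - mean for r in ratings])
    k_star = [rbf_kernel(t_new, t, length_scale, variance) for t in times]
    return mean + sum(k * a for k, a in zip(k_star, alpha))

# Hypothetical star ratings for one business whose quality drops midway.
times = list(range(10))
ratings = [5, 5, 4, 5, 5, 2, 2, 1, 2, 2]
sample_mean = sum(ratings) / len(ratings)
gp_estimate = gp_predict(times, ratings, t_new=10)
print(f"sample mean: {sample_mean:.2f}, GP estimate at t=10: {gp_estimate:.2f}")
```

Because the kernel weights nearby observations more heavily, the GP estimate for the next period follows the recent low ratings, whereas the sample mean still averages in the early high ones.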

data mining, machine learning, natural language, (24 more...)

arXiv.org Artificial Intelligence

2511.14743

Country:

Europe (0.93)
North America > United States (0.92)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (0.93)
Information Technology > Services > e-Commerce Services (0.45)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
(8 more...)

Do Streetscapes Still Matter for Customer Ratings of Eating and Drinking Establishments in Car-Dependent Cities?

Han, Chaeyeon, Lieu, Seung Jae, Hwang, Uijeong, Guhathakurta, Subhrajit

arXiv.org Artificial Intelligence, Aug-12-2025

This study examines how indoor and outdoor aesthetics, streetscapes, and neighborhood features shape customer satisfaction at eating and drinking establishments (EDEs) across different urban contexts, varying in car dependency, in Washington, DC. Using review photos and street view images, computer vision models quantified perceived safety and visual appeal. Ordinal logistic regression analyzed their effects on Yelp ratings. Findings reveal that both indoor and outdoor environments significantly impact EDE ratings, while streetscape quality's influence diminishes in car-dependent areas. The study highlights the need for context-sensitive planning that integrates indoor and outdoor factors to enhance customer experiences in diverse settings.
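For readers unfamiliar with ordinal logistic regression, the sketch below shows how a proportional-odds model turns predictors into probabilities over ordered star ratings. The coefficients, cutpoints, and the two predictors (a perceived-safety score and a visual-appeal score) are hypothetical values chosen for illustration; they are not estimates from the study.

```python
import math

def sigmoid(z):
    """Logistic function."""
    return 1.0 / (1.0 + math.exp(-z))

def ordinal_probs(x, beta, cutpoints):
    """Category probabilities under a proportional-odds model.

    P(Y <= k | x) = sigmoid(cutpoints[k] - beta . x), with increasing cutpoints.
    Returns one probability per ordered category (len(cutpoints) + 1 of them).
    """
    eta = sum(b * xi for b, xi in zip(beta, x))
    cumulative = [sigmoid(c - eta) for c in cutpoints] + [1.0]
    probs, previous = [], 0.0
    for c in cumulative:
        probs.append(c - previous)  # difference of adjacent cumulative probabilities
        previous = c
    return probs

def expected_rating(probs):
    """Mean rating when the categories are labeled 1, 2, ..., len(probs)."""
    return sum((k + 1) * p for k, p in enumerate(probs))

# Hypothetical coefficients for [perceived safety, visual appeal] and
# four cutpoints separating the five star levels.
beta = [0.8, 0.5]
cutpoints = [-2.0, -1.0, 0.0, 1.5]

low = ordinal_probs([0.0, 0.0], beta, cutpoints)
high = ordinal_probs([1.0, 1.0], beta, cutpoints)
print("low-quality streetscape :", [round(p, 3) for p in low])
print("high-quality streetscape:", [round(p, 3) for p in high])
```

With positive coefficients, raising either predictor shifts probability mass toward the higher star categories, which is the kind of effect the regression coefficients in such a study describe.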

artificial intelligence, customer satisfaction, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2508.06513

Country: North America > United States > District of Columbia > Washington (0.24)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine (1.00)
Government (0.69)
Banking & Finance (0.68)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.35)

StatLLM: A Dataset for Evaluating the Performance of Large Language Models in Statistical Analysis

Song, Xinyi, Lee, Lina, Xie, Kexin, Liu, Xueying, Deng, Xinwei, Hong, Yili

arXiv.org Artificial Intelligence, Feb-24-2025

The coding capabilities of large language models (LLMs) have opened up new opportunities for automatic statistical analysis in machine learning and data science. However, before their widespread adoption, it is crucial to assess the accuracy of code generated by LLMs. A major challenge in this evaluation lies in the absence of a benchmark dataset for statistical code (e.g., SAS and R). To fill this gap, this paper introduces StatLLM, an open-source dataset for evaluating the performance of LLMs in statistical analysis. The StatLLM dataset comprises three key components: statistical analysis tasks, LLM-generated SAS code, and human evaluation scores. The first component includes statistical analysis tasks spanning a variety of analyses and datasets, providing problem descriptions, dataset details, and human-verified SAS code. The second component features SAS code generated by ChatGPT 3.5, ChatGPT 4.0, and Llama 3.1 for those tasks. The third component contains evaluation scores from human experts assessing the correctness, effectiveness, readability, executability, and output accuracy of the LLM-generated code. We also illustrate the unique potential of the established benchmark dataset for (1) evaluating and enhancing natural language processing metrics, (2) assessing and improving LLM performance in statistical coding, and (3) developing and testing next-generation statistical software, advancements that are crucial for data science and machine learning research.

dataset, evaluation, sas code, (14 more...)

arXiv.org Artificial Intelligence

2502.17657

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Virginia (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
(4 more...)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Performance Evaluation of Large Language Models in Statistical Programming

Song, Xinyi, Xie, Kexin, Lee, Lina, Chen, Ruizhe, Clark, Jared M., He, Hao, He, Haoran, Min, Jie, Zhang, Xinlei, Zheng, Simin, Zhang, Zhiyang, Deng, Xinwei, Hong, Yili

arXiv.org Artificial Intelligence, Feb-18-2025

The programming capabilities of large language models (LLMs) have revolutionized automatic code generation and opened new avenues for automatic statistical analysis. However, the validity and quality of this generated code need to be systematically evaluated before it can be widely adopted. Despite the growing prominence of LLMs, comprehensive evaluations of the statistical code they generate remain scarce in the literature. In this paper, we assess the performance of LLMs, including two versions of ChatGPT and one version of Llama, in the domain of SAS programming for statistical analysis. Our study utilizes a set of statistical analysis tasks encompassing diverse statistical topics and datasets. Each task includes a problem description, dataset information, and human-verified SAS code. We conduct a comprehensive assessment of the quality of SAS code generated by LLMs through human expert evaluation based on correctness, effectiveness, readability, executability, and the accuracy of output results. The analysis of rating scores reveals that while LLMs demonstrate usefulness in generating syntactically correct code, they struggle with tasks requiring deep domain understanding and may produce redundant or incorrect results. This study offers valuable insights into the capabilities and limitations of LLMs in statistical programming, providing guidance for future advancements in AI-assisted coding systems for statistical analysis.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2502.13117

Country:

Europe (0.67)
North America > United States > Florida (0.28)

Genre: Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Grammaticality Representation in ChatGPT as Compared to Linguists and Laypeople

Qiu, Zhuang, Duan, Xufeng, Cai, Zhenguang G.

arXiv.org Artificial Intelligence, Jun-16-2024

Large language models (LLMs) have demonstrated exceptional performance across various linguistic tasks. However, it remains uncertain whether LLMs have developed human-like fine-grained grammatical intuition. This preregistered study (https://osf.io/t5nes) presents the first large-scale investigation of ChatGPT's grammatical intuition, building upon a previous study that collected laypeople's grammatical judgments on 148 linguistic phenomena that linguists judged to be grammatical, ungrammatical, or marginally grammatical (Sprouse, Schütze, & Almeida, 2013). Our primary focus was to compare ChatGPT with both laypeople and linguists in the judgment of these linguistic constructions. In Experiment 1, ChatGPT assigned ratings to sentences based on a given reference sentence. Experiment 2 involved rating sentences on a 7-point scale, and Experiment 3 asked ChatGPT to choose the more grammatical sentence from a pair. Overall, our findings demonstrate convergence rates ranging from 73% to 95% between ChatGPT and linguists, with an overall point-estimate of 89%. Significant correlations were also found between ChatGPT and laypeople across all tasks, though the correlation strength varied by task. We attribute these results to the psychometric nature of the judgment tasks and the differences in language processing styles between humans and LLMs.

chatgpt, linguist, participant, (15 more...)

arXiv.org Artificial Intelligence

2406.11116

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
Asia > China > Hong Kong (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(3 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Exploring Iterative Enhancement for Improving Learnersourced Multiple-Choice Question Explanations with Large Language Models

Bao, Qiming, Leinonen, Juho, Peng, Alex Yuxuan, Zhong, Wanjun, Gendron, Gaël, Pistotti, Timothy, Huang, Alice, Denny, Paul, Witbrock, Michael, Liu, Jiamou

arXiv.org Artificial Intelligence, Jan-19-2024

Large language models exhibit superior capabilities in processing and understanding language, yet their applications in educational contexts remain underexplored. Learnersourcing enhances learning by engaging students in creating their own educational content. When learnersourcing multiple-choice questions, creating explanations for the solution of a question is a crucial step; it helps other students understand the solution and promotes a deeper understanding of related concepts. However, it is often difficult for students to craft effective solution explanations, due to limited subject understanding. To help scaffold the task of automated explanation generation, we present and evaluate a framework called "ILearner-LLM" that iteratively enhances the generated explanations for the given questions with large language models. Comprising an explanation generation model and an explanation evaluation model, the framework generates high-quality student-aligned explanations by iteratively feeding the quality rating score from the evaluation model back into the instruction prompt of the explanation generation model. Experimental results demonstrate the effectiveness of ILearner-LLM on LLaMA2-13B and GPT-4 in generating higher-quality explanations that are closer to those written by students on five PeerWise datasets. Our findings represent a promising path to enrich the learnersourcing experience for students and to enhance the capabilities of large language models for educational applications.
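The feedback loop described in this abstract (generate, rate, feed the rating back into the prompt) can be sketched as follows. This is only an illustration of the control flow, not the ILearner-LLM implementation: `build_prompt`, `toy_generate`, and `toy_evaluate` are hypothetical stand-ins, the prompt wording is invented, and the toy models are deterministic functions rather than real language models.

```python
def build_prompt(question, feedback):
    """Compose the instruction prompt, appending the last explanation and its rating."""
    prompt = f"Explain the answer to: {question}"
    if feedback is not None:
        explanation, score = feedback
        prompt += (f"\nPrevious explanation: {explanation}"
                   f"\nIts quality rating was {score:.1f}/5. Improve it.")
    return prompt

def iterative_enhance(question, generate, evaluate, max_rounds=5, target=4.0):
    """Regenerate an explanation until its rating reaches `target` or rounds run out.

    Returns the best explanation, its score, and the per-round score history.
    """
    best_explanation, best_score = None, float("-inf")
    feedback, history = None, []
    for _ in range(max_rounds):
        explanation = generate(build_prompt(question, feedback))
        score = evaluate(question, explanation)
        history.append(score)
        if score > best_score:
            best_explanation, best_score = explanation, score
        if score >= target:
            break
        feedback = (explanation, score)  # rating goes back into the next prompt
    return best_explanation, best_score, history

def toy_generate(prompt):
    """Stand-in for the generation model: extends the previous explanation if given one."""
    marker = "Previous explanation: "
    if marker not in prompt:
        return "The answer is B."
    previous = prompt.split(marker, 1)[1].split("\n", 1)[0]
    return previous + " Here is one more supporting reason."

def toy_evaluate(question, explanation):
    """Stand-in for the evaluation model: rewards each supporting reason, capped at 5."""
    return min(5.0, 1.0 + explanation.count("reason"))

explanation, score, history = iterative_enhance(
    "Which option is correct?", toy_generate, toy_evaluate)
print(history)
print(explanation)
```

In a real setting the two stand-in functions would wrap calls to the generation and evaluation models, and the stopping threshold and round limit would need tuning.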

explanation, instruction, rating score, (15 more...)

arXiv.org Artificial Intelligence

2309.10444

Country:

Oceania > New Zealand > North Island > Auckland Region > Auckland (0.06)
South America > Uruguay > Maldonado > Maldonado (0.04)
North America > United States > Massachusetts (0.04)
(2 more...)

Genre: Research Report > New Finding (0.86)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Multi-modal Machine Learning for Vehicle Rating Predictions Using Image, Text, and Parametric Data

Su, Hanqi, Song, Binyang, Ahmed, Faez

arXiv.org Artificial Intelligence, May-27-2023

Accurate vehicle rating prediction can facilitate designing and configuring good vehicles. This prediction allows vehicle designers and manufacturers to optimize and improve their designs in a timely manner, enhance their product performance, and effectively attract consumers. However, most of the existing data-driven methods rely on data from a single mode, e.g., text, image, or parametric data, which results in a limited and incomplete exploration of the available information. These methods lack comprehensive analyses and exploration of data from multiple modes, which probably leads to inaccurate conclusions and hinders progress in this field. To overcome this limitation, we propose a multi-modal learning model for more comprehensive and accurate vehicle rating predictions. Specifically, the model simultaneously learns features from the parametric specifications, text descriptions, and images of vehicles to predict five vehicle rating scores: the total score, critics score, performance score, safety score, and interior score. We compare the multi-modal learning model to the corresponding unimodal models and find that the multi-modal model's explanatory power is 4% to 12% higher than that of the unimodal models. On this basis, we conduct sensitivity analyses using SHAP to interpret our model and provide design and optimization directions to designers and manufacturers. Our study underscores the importance of the data-driven multi-modal learning approach for vehicle design, evaluation, and optimization. We have made the code publicly available at http://decode.mit.edu/projects/vehicleratings/.

prediction, rating score, vehicle, (14 more...)

arXiv.org Artificial Intelligence

doi: 10.1115/DETC2023-115076

2305.15218

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.34)
Europe > Belgium > Flanders > East Flanders > Ghent (0.04)
Asia > India (0.04)
Asia > China (0.04)

Genre: Research Report > New Finding (0.68)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Road (1.00)
Automobiles & Trucks > Manufacturer (1.00)
Health & Medicine (0.68)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Graph Embedding Augmented Skill Rating System

Wang, Jiasheng

arXiv.org Artificial Intelligence, Apr-17-2023

This paper presents a framework for learning player embeddings in competitive games and events. Players and their win-loss relationships are modeled as a skill gap graph, which is an undirected weighted graph. The player embeddings are learned from the graph using a random walk-based graph embedding method and can reflect the relative skill levels among players. Embeddings are low-dimensional vector representations that can be conveniently applied to subsequent tasks while still preserving the topological relationships in a graph. In the latter part of this paper, Graphical Elo (GElo) is introduced as an application of player embeddings when rating player skills. GElo is an extension of the classic Elo rating system. It constructs a skill gap graph based on player match histories and learns player embeddings from it. Afterward, the rating scores that were calculated by Elo are adjusted according to player activeness and cosine similarities among player embeddings. GElo can be executed offline and in parallel, and it is non-intrusive to existing rating systems. Experiments on public datasets show that GElo makes a more reliable evaluation of player skill levels than vanilla Elo. The experimental results suggest potential applications of player embeddings in competitive games and events.
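As background for the two ingredients named in this abstract, the sketch below shows the classic Elo update and the cosine similarity between two embedding vectors. It does not reproduce GElo's adjustment rule, which the abstract does not specify; the K-factor of 32 and the example embeddings are assumptions made for illustration.

```python
import math

def expected_score(rating_a, rating_b):
    """Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

def elo_update(rating_a, rating_b, score_a, k=32.0):
    """Return updated ratings after one game.

    score_a is 1.0 for a win by A, 0.5 for a draw, 0.0 for a loss.
    k is the K-factor (32 is a common choice, assumed here).
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b

def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Two equally rated players; A wins.
new_a, new_b = elo_update(1500.0, 1500.0, score_a=1.0)
print(new_a, new_b)

# Hypothetical 3-dimensional player embeddings.
similarity = cosine_similarity([0.2, 0.9, 0.1], [0.3, 0.8, 0.0])
print(round(similarity, 3))
```

A GElo-style system would take ratings like `new_a` as its starting point and then adjust them using similarities such as the one computed here, together with player activeness.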

artificial intelligence, graph, machine learning, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/TG.2022.3221849

2304.08257

Country: Asia > China > Shanghai > Shanghai (0.04)

Genre: Research Report > New Finding (0.88)

Industry:

Leisure & Entertainment > Games > Computer Games (1.00)
Leisure & Entertainment > Sports (0.93)

Technology:

Information Technology > Data Science (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.68)
